perm filename A02.TEX[154,RWF] blob
sn#856865 filedate 1988-05-05 generic text, type C, neo UTF8
COMMENT ⊗ VALID 00002 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00002 00002 \magnification\magstephalf
C00022 ENDMK
C⊗;
\magnification\magstephalf
\input macro.tex
\def\today{\ifcase\month\or
January\or February\or March\or April\or May\or June\or
July\or August\or September\or October\or November\or December\fi
\space\number\day, \number\year}
\baselineskip 14pt
\def\fnc#1{\mathop{\rm #1}\nolimits}
\rm
\line{\sevenrm a02.tex[154,phy] \today\hfill}
\def\dminus{\mathrel{\vbox{\lineskip1pt\baselineskip1pt
\halign{\ctr{$##$}\cr
.\cr-\cr}}}}
A {\it context-free grammar\/} (CFG) $G$ is a system for defining a
language over~$\Sigma↓G$. It uses a finite alphabet $V↓G$ which is
an extension of~$\Sigma↓G$; we define $N↓G=V↓G-\Sigma↓G$. It has a
{\it root symbol\/} $S↓G\in N↓G$. It has a finite set $P↓G$ of productions,
where a production is an element of $N\times V↑{\ast}$. We write
$A\tog x$ to indicate that $\langle A,x\rangle$ is a production;
we write $A\tog x↓1|x↓2\ldots|x↓n$ to indicate that
$\langle A,x↓1\rangle$, $\langle A,x↓2\rangle$, $\ldots$, $\langle A,x↓n\rangle$
are all productions. Thus $\tog$ is a relation from $N$ to~$V↑{\ast}$.
We extend it to a relation $\Rag$ from $V↑{\ast}$ to~$V↑{\ast}$,
where if $A\tog x$, $uAv\Rag uxv$. The relation
$\aRag$ is the reflexive transitive closure
of~$\Rag$. The language $L↓G$ defined by $G$ is
$\{\,x\mid S\aRag x\in\Sigma↑{\ast}\,\}$. Where
only one CFG is being discussed, we omit the subscript~$G$ and write
$V$, $\Sigma$, $P$, $S$, $→$, $\Rightarrow$, $\aRa$
and~$L$. A~language definable by a CFG we call a {\it context-free language\/}
(CFL).
An FAL can be defined by a CFG. Let $S=q↓0$, $N=Q$. Let the productions
be $q→aq'$, where $\delta(q,a)=q'$, and $q→a$ where $\delta(q,a)=q'\in F$.
Show by mathematical induction that if $q↓0\aRa x$,
either $\delta(q↓0,x)\in F$ or $x$ is $uq$ such that $\delta(q↓0,u)=q$.
Conversely, if a CFG $G$ is right linear (all productions belong to
$V\times(\Sigma↑{\ast}V∪\Sigma↑{\ast})$), then $L↓G$ is an~FAL.
\vfill\eject
\noindent{\bf String Axioms.}
An {\it alphabet\/} (usually called $\Sigma$) is a set, finite unless
otherwise stated, whose elements are called {\it characters\/}, or
occasionally symbols. {\it Strings\/} (over $\Sigma$) are finite
sequences of characters. The set of all strings over $\Sigma$ is
called~$\Sigma↑{\ast}$. A~physical realization of these definitions
must allow for distinguishing characters in all string contexts. The
axioms for strings are analogous to the Peano axioms for integers:
$$\vcenter{\halign{$\lft{#}\qquad$&\lft{#}\cr
\qquad\hbox{Integers}&\qquad Strings\cr
\noalign{\smallskip}
0\in N&\quad $\epsilon\in\Sigma↑{\ast}$\cr
&(the empty string $\epsilon$ is the empty sequence)\cr
x\in N⊃S(x)\in N&For each $a\in\Sigma$,\cr
&\quad $x\in\Sigma↑{\ast}⊃S↓a(x)\in\Sigma↑{\ast}$\cr
S(x)≠0&$S↓a(x)≠\epsilon$\cr
x≠y⊃S(x)≠S(y)&$x≠y⊃S↓a(x)≠S↓b(y)$\cr
&$a≠b⊃S↓a(x)≠S↓b(y)$\cr
\bigl(P(0)∧\bigl(\forall x.P(x)⊃P\bigl(S(x)\bigr)\bigr)\bigr)
&$P(\epsilon)∧\bigl(\forall x,a.P(x)⊃P\bigl(S↓a(x)\bigr)\bigr)$\cr
\qquad ⊃\forall x.P(x)&$\qquad ⊃\forall x\,P(x)$\cr
\noalign{\smallskip}
\hbox{I.e., there are no ``infinite''}&I.e., there are no ``infinite''\cr
\hbox{integers; all can be reached from}&strings.\cr
\hbox{zero in finitely many steps.}\cr}}$$
\smallskip
A {\it language\/} $L$ over $\Sigma$ is a subset of $\Sigma↑{\ast}$. It may
be empty, finite, infinite, etc. Languages over $\Sigma$ include~$\emptyset$,
the empty language; $\{\epsilon\}$,~the language containing only the
empty string, often abbreviated to~$\epsilon$ when no confusion can occur;
and $\Sigma↑{\ast}$ itself.
\vfill\eject
\noindent{\bf Exercise.}
Show that, if $f$ is a function from set~$S$ to set~$T$, the relation
$R$ on~$S$ defined by $(xRy)\equiv \bigl(f(x)=f(y)\bigr)$ is an
equivalence relation. Conversely show that if $R$ is an equivalence
relation on a set~$S$, there is a set~$T$ and a function~$f$ from
$S$ to~$T$ such that $xRy$ if and only if $f(x)=f(y)$.
\bigskip
\hrule
\bigskip
Automaton: A finite set of control states, a finite sequence of
{\it memory units\/}, a~transition function, a~start state (or a set
of start states), a~set of accepting states.
Memory units: a countable set of {\it contents\/} (without loss of
generality, non-negative integers), with one designated as the
{\it initial content\/} (say, zero), a~finite set of singulary
recursive {\it transition functions\/}, and a finite set of recursive
predicates called {\it tests\/}.
\smallskip\noindent
Examples of memory units:
\smallskip
\disleft 25pt:(1):
A counter. Contents are non-negative integers. Initial content is~0.
Transition functions are $\hbox{increment}(x)=x+1$,
$\hbox{decrement}(x)=x\dminus 1$. The test is $\hbox{zero}(x)≡(x=0)$.
\smallskip
\disleft 25pt:(2):
A stack. Contents are positive integers. Initial content is~1. Transition
functions are $\hbox{pushzero}(x)=2x$, $\hbox{pushone}(x)=2x+1$,
and $\hbox{pop}(x)=\lfloor x/2\rfloor$. The tests are
$\hbox{empty}(x)\equiv (x=1)$, $\hbox{topzero}(x)\equiv (x>1∧x\ \rm is\ even)$,
$\hbox{topzero}(x)\equiv (x>1∧x\ \rm is\ odd)$.
\smallskip
\disleft 25pt:(3):
A tape. Contents are $\langle\hbox{stack}\times (0,1)\times\hbox{stack}\rangle$;
transition functions are $\hbox{left}(x,y,z)=\langle\hbox{pushy}(x)$,
$\hbox{top}(z),\hbox{pop}(z)\rangle$, etc. The transition function of an
automaton is a function of control state and of all tests of the devices,
giving a new control state and a transition function (which may be the
identity function) for each device.
\medskip
\hrule
\medskip\noindent
Moore machine:
$$\vcenter{\baselineskip6pt\halign{$\ctr{#}$\qquad
&$\ctr{#}$\qquad
&$\ctr{#}$\qquad
&$\ctr{#}$\qquad
&$\ctr{#}$\cr
f&&g&&h\cr
&a&&b\cr
i&&j&&k\cr}}$$
\bigskip\noindent
Inputs label edges, but outputs label states. A computation with $n$ inputs
has $n+1$ outputs.
\medskip\noindent
Mealy machine:
$$\vcenter{\baselineskip6pt\halign{$\ctr{#}$\qquad
&$\ctr{#}$\qquad
&$\ctr{#}$\qquad
&$\ctr{#}$\qquad
&$\ctr{#}$\cr
&a/g&&b/h\cr
i&&j&&k\cr}}$$
\bigskip\noindent
A computation with $n$ inputs has $n$ outputs. A~Moore machine can be transformed
as shown above to a Mealy machine, omitting the (constant) first output.
A~Mealy machine can be transformed to a Moore machine as shown below:
$$\vcenter{\baselineskip6pt\halign{$\ctr{#}$\qquad
&$\ctr{#}$\qquad
&$\ctr{#}$\qquad
&$\ctr{#}$\qquad
&$\ctr{#}$\cr
&&g&&h\cr
&a&&b\cr
i,?&&j,g&&k,h\cr}}$$
\bigskip\noindent
Generalized sequential machines: a Mealy machine, without restricting inputs
and outputs to be of length~1, and without assuming determinatism.
\vfill
\hrule
\eject
\noindent
{\bf Theorem 1.} {\sl Two way NFAs accept only regular sets\/}.
\medskip
Let $M$ be a 2way nondeterministic FA. Define, for every string $x$, a set
of state pairs $P↓{LR}(x)=\{\,\langle i,j\rangle\mid$ if $M$ enters $x$
on the left in state~$i$, it may leave on the right in state~$j\,\}$.
Similarly, $P↓{LL}(x)=\{\langle i,j\rangle\mid$ if $M$ enters $x$ on
the left in state~$i$, it may leave on the left in state~$j\,\}$;
$P↓{RL}(x)$ and $P↓{RR}(x)$ are analogous for entering on the right. These
relations may be obtained recursively:
$$\eqalign{P↓{LR}(xy)&=P↓{LR}(x)\bigl(P↓{LL}(y)P↓{RR}(x)\bigr)↑{\ast}
R↓{LR}(y)\cr
P↓{LL}(xy)&=P↓{LL}(x)∪P↓{LR}(x)\bigl(P↓{LL}(y)P↓{RR}(x)\bigr)↑{\ast}
R↓{LL}(y)P↓{RL}(x)\,.\cr}$$
A one-way automaton with finite memory for $L↓M$ keeps track, when it
has read~$x$, of $P↓{LL}(x)$, $P↓{LR}(x)$, $P↓{RL}(x)$, and $P↓{RR}(x)$.
When it reads another character~$y$, it uses the above properties to get
$P↓{LL}(xy)$, et. At the end of the input, $x$~is accepted if $P↓{LR}(x)$
includes $\langle q↓0,p\rangle$ for some $p$ in~$F$.
\medskip
\hrule
\vfill\eject
Not only can a stack machine accept non-regular languages, it can accept
some regular languages using fewer states than any FA for those languages.
Here is a seven-state PDA for strings of length a multiple of~8.
$$\vcenter{\halign{\ctr{#}\quad&\ctr{#}\qquad&\ctr{#}\qquad\qquad&\ctr{#}\cr
$\cdot$\cr
\noalign{\bigskip}
Push $0$\cr
\noalign{\bigskip}
$\cdot$\cr
\noalign{\bigskip}
Push $0$\cr
\noalign{\bigskip}
$\cdot$\cr
\noalign{\bigskip}
Push $0$\cr
\noalign{\bigskip}
$\cdot$\cr
\noalign{\bigskip}
Read\cr
\noalign{\bigskip}
&$0$\cr
Pop&&&Push $1$\cr
\noalign{\bigskip}
$\cdot$&$1$\cr
\noalign{\bigskip}
&$1$\cr
Pop&&&Push $1$\cr
\noalign{\bigskip}
$\cdot$&$1$\cr
\noalign{\bigskip}
&$0$\cr
Pop&&&Push $1$\cr}}$$
\vfill\eject
Prefix equivalence classes for the infinite language $\{\,L↑iR↑i\mid i≥0\,\}$
\figbox 2truein:
\noindent
States can be labeled with the shortest string in the equivalence class:
$\epsilon$, $0$, $00$, $000$, $\ldots$, $01$, $001$, $0001$, $\ldots$, $010$.
\bigskip
\hrule
\bigskip\noindent
Myhill-Nerode Theorem.
For $L$ a language, define $f(x)=\{\,y\mid xy\in L\,\}$. If $L$ is
an FAL, $\delta(q↓0,x↓1)=\delta(q↓0,x↓2)$ implies $f(x↓1)=f(x↓2)$, and
since $\delta$ has finite range, $f$~has only finitely many values.
Define $x↓1\sim x↓2$ (``prefix equivalence'') if $f(x↓1)=f(x↓2)$;
clearly an equivalence relation. If $x↓1\sim x↓2$, then $x↓1z\sim x↓2z$;
this property is called right invariance. The language~$L$ is the union of
those equivalence classes $E(x)$ for which $\epsilon\in f(x)$. Then if
$L$ is an FAL, it is the union of some equivalence classes of a finite
right invariant equivalence relation on strings. (The converse is true
also.) If $f$ has infinite range, no FA can recognize~$L$, since
$x↓1\not\sim x↓2$ requires $\delta(q↓0,x↓1)≠\delta(q↓0,x↓2)$.
\bigskip
\hrule
\bigskip
Classify all regular sets as empty $(E)$, null string only $(N)$, other
finite~$(F)$, and infinite~$(I)$. Closure properties under union,
concatenation, and iteration are:
$$\vbox{\tabskip=0pt\offinterlineskip
\halign{$#$\quad&\vrule#&\strut\quad$#$&\quad$#$&\quad$#$&\quad$#$\qquad\qquad
&$#$\quad&\vrule#&\quad$#$&\quad$#$&\quad$#$&\quad$#$\qquad\qquad
&$#$\quad&\vrule#&\quad$#$&\quad$#$&\quad$#$&\quad$#$\cr
\circ&\omit&E&N&F&I&+&\omit&E&N&F&I&\ast&\omit&E&N&F&I\cr
&\multispan5\hrulefill\qquad\qquad&&\multispan5\hrulefill\qquad\qquad
&&\multispan5\hrulefill\cr
&height2pt&\omit&\omit&\omit&\omit&\omit&&\omit&\omit&\omit&\omit&\omit&%
&\omit&\omit&\omit&\omit\cr
E&&E&E&E&E&E&&E&N&F&I&&&N&N&I&I\cr
N&&E&N&F&I&N&&N&N&F&I\cr
F&&E&F&F&I&F&&F&F&F&I\cr
I&&E&I&I&I&I&&I&I&I&I\cr}}$$
\noindent
Since $\{\epsilon\}\in N$, $\emptyset\in E$, and $\{a\}\in F$, all regular
expressions can be recursively classified as one of the four types.
\bigskip
\hrule
\bigskip
\noindent Exercise:
Prove if context free language $L\subseteq a↑{\ast}b↑{\ast}$, there is a
CFG for $L$ with all productions of the form $A→\alpha B\gamma$ or
$A→\alpha$, $\alpha,\beta\in\Sigma↑{\ast}$; $A,B\in V$. [RWF rewrite]
\bigskip
\hrule
\vfill\eject
Palindromes over an alphabet of two or more characters are not regular.
\medskip
Proof: If they are regular, take a FA generator for them. Compose with a
filter for $a↑{\ast}ba↑{\ast}$, to get $\{a↑iba↑i\}$. Compose with the
obvious transducer to get $\{a↑ib↑i\}$, known not to be regular:
contradiction.
\bigskip\noindent
Exercise:
Define $L↓1/L↓2=\{\,x\mid xy\in L↓1, y\in L↓2\,\}$. Show if $L↓1$ is
regular, so is $L↓1/L↓2$, for {\it any\/} language~$L↓2$.
\vfill\eject\end